Asymmetry in Corpus-Derived and Human Word Associations

نویسندگان

  • Lukas Michelbacher
  • Stefan Evert
  • Hinrich Schütze
چکیده

We investigate asymmetry in corpus-derived and human word associations. Most prior work has studied paradigmatic relations, either derived from free association norms or from large corpora using measures of statistical association and semantic relatedness. By contrast, we investigate the syntagmatic relation between words in adjective-noun and noun-noun combinations and present a new experimental design for measuring the strength of human associations. Of particular importance for syntagmatic relations are asymmetric associations, whose associational strength is much larger in one direction (e.g., from Pyrrhic to victory) than in the other (e.g., from victory to Pyrrhic). We develop a number of corpus-derived measures of asymmetric association and show that they predict the directedness of human associations with high accu-

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Graph-Based Approach for Computing Free Word Associations

A graph-based algorithm is used to analyze the co-occurrences of words in the British National Corpus. It is shown that the statistical regularities detected can be exploited to predict human word associations. The corpus-derived associations are evaluated using a large test set comprising several thousand stimulus/response pairs as collected from humans. The finding is that there is a high agr...

متن کامل

Comparing predictions of lexical norm data obtained using word associations and word collocation

We compared the quality of prediction of word variables based on a Dutch word association and text corpus. We derived estimates for: valence, arousal, dominance, concreteness and age of acquisition (AoA) for 2831 words. Based on the similarity between words we: (1) used projections on a dimension identified as the variable in question in a multidimensional representation, (2) used the k-nearest...

متن کامل

Vocabulary Lists for EAP and Conversation Students

Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...

متن کامل

Do We Need Discipline-Specific Academic Word Lists? Linguistics Academic Word List (LAWL)

This corpus-based study aimed at exploring the most frequently-used academic words in linguistics and compare the wordlist with the distribution of high frequency words in Coxhead’s Academic Word List (AWL) and West’s General Service List (GSL) to examine their coverage within the linguistics corpus. To this end, a corpus of 700 linguistics research articles (LRAC), consisting of approximately ...

متن کامل

Developing a Corpus-Based Word List in Pharmacy Research ‎Articles: A Focus on Academic Culture

The present corpus-based lexical study reports the development of a Pharmacy Academic Word List (PAWL); a list of the most frequent words from a corpus of 3,458,445 tokens made up of 800 most recent pharmacy texts including research articles, review articles, and short communications in four sub-disciplines of pharmacy. WordSmith (Scott, 2017) and AntWordProfiler (Anthony, 2014) were used to sc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011